A Novel Nonparallel Plane Proximal SVM for Imbalance Data Classification

Authors

  • Bing Yang
  • Ling Jing
Abstract

Imbalanced data classification is an active research topic in data mining. Conventional classifiers are poorly suited to imbalanced learning tasks because they tend to assign instances to the majority class, which is the less important class. This paper pays close attention to the uneven data distribution that characterizes imbalanced classification problems. Without changing the original imbalanced training data, it shows the advantages of proximal classifiers for imbalanced data classification. To improve classification accuracy, it proposes a new model named LS-NPPC, based on the classical proximal SVM models that find two nonparallel planes for data classification. The LS-NPPC model is applied to six UCI datasets and one real application. The results indicate the effectiveness of the proposed model for imbalanced data classification problems.


Similar resources

A Sparse Twin SVM for multi-classification problems

We propose Sparse TSVM, a multi-class SVM classifier that determines k nonparallel planes by solving k related SVM-type problems. Sparse TSVM extends Twin SVM to the one-versus-rest approach, and its sparse algorithm captures the main features of each class better. On several benchmark data sets, Sparse TSVM is not only fast but also generalizes well.


Detection of Horizontal Gene Transfer in Bacterial Genomes

Most bacterial genes were acquired by horizontal gene transfer (HGT) from other prokaryotic organisms instead of being inherited by continuous vertical descent from an ancient ancestor. HGT is generally believed to be a major factor in microbial evolution, allowing rapid diversification and adaptation. In this paper, we artificially simulate HGT by inserting phage genes into bacterial genome...


Enhancing the Performance of SVM on Skewed Data Sets by Exciting Support Vectors

In pattern recognition and data mining, a data set is called skewed or imbalanced if it contains a large number of objects of one type and a very small number of objects of the opposite type. Imbalance in data sets is a challenging problem for most classification methods because the generalization power of classic classifiers is poor on skewed data sets. Ma...


Fuzzy Least Squares Twin Support Vector Machines

Least Squares Twin Support Vector Machine (LSTSVM) is an extremely efficient and fast version of the SVM algorithm for binary classification. LSTSVM combines the ideas of Least Squares SVM and Twin SVM: two nonparallel hyperplanes are found by solving two systems of linear equations. Although the algorithm is very fast and efficient in many classification tasks, it is unable to cope with tw...
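
As a rough illustration of how two nonparallel hyperplanes can be obtained from two linear systems, here is a minimal sketch of the standard linear LSTSVM formulation. It is not the fuzzy variant described in this entry, nor the LS-NPPC model from the main abstract. The function names `fit_lstsvm` and `predict_lstsvm`, the penalty parameters `c1` and `c2`, and the small ridge term `eps` (added only for numerical stability) are assumptions made for this example.

```python
import numpy as np

def fit_lstsvm(A, B, c1=1.0, c2=1.0, eps=1e-8):
    """Fit two nonparallel planes; A holds class +1 rows, B holds class -1 rows."""
    # Augment each class matrix with a column of ones so the bias is part of u = [w; b].
    E = np.hstack([A, np.ones((A.shape[0], 1))])
    F = np.hstack([B, np.ones((B.shape[0], 1))])
    ridge = eps * np.eye(E.shape[1])  # assumed regularizer, keeps the systems well-conditioned
    # Plane 1: close to class +1, pushed to roughly -1 on class -1.
    u1 = -np.linalg.solve(F.T @ F + (E.T @ E) / c1 + ridge, F.T @ np.ones(F.shape[0]))
    # Plane 2: close to class -1, pushed to roughly +1 on class +1.
    u2 = np.linalg.solve(E.T @ E + (F.T @ F) / c2 + ridge, E.T @ np.ones(E.shape[0]))
    return u1, u2

def predict_lstsvm(X, u1, u2):
    """Assign each row of X to the class whose plane is nearer."""
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    # Perpendicular distances; the last entry of each u is the bias, the rest is w.
    d1 = np.abs(Xa @ u1) / np.linalg.norm(u1[:-1])
    d2 = np.abs(Xa @ u2) / np.linalg.norm(u2[:-1])
    return np.where(d1 <= d2, 1, -1)
```

Each plane comes from a single call to `np.linalg.solve` on a system whose size is the number of features plus one, which is why this family of methods is fast. Because each plane is fitted to be close to its own class rather than to separate the two classes with one margin, a small minority class still gets its own plane.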


Parallel selective sampling method for imbalanced and large data classification

Several applications aim to identify rare events from very large data sets. Classification algorithms may show serious limitations on large data sets and suffer performance degradation due to class imbalance. Many solutions have been presented in the literature to deal with either huge amounts of data or class imbalance separately. In this paper we assessed the performance of a novel method, P...



Journal title:
  • JSW

Volume 9

Publication date: 2014